Draft- not for distribution

This analysis examines the neighborhood characteristics and housing market conditions in Cuyahoga County’s 1161 block groups between 2017 and 2019. We began by gathering parcel- and address-level data from several sources. Property records obtained from the Cuyahoga County Fiscal Officer are the primary data source; we also incorporated information from HUD, the Cuyahoga Land Bank, and other sources.

At first blush, it might seem to make the most sense to examine housing markets for a single year. We investigated this possibility and found that in many block groups, even those composed primarily of residential properties, there was simply too little sales activity to obtain stable and meaningful estimates of current market conditions. In any given year, many block groups had between zero and five arms-length property transfers.

By combining information from three consecutive years, we are able to get a fuller and more accurate sense of market conditions in Cuyahoga County.



After aggregating property records to the block group level, we dropped from the analysis any block groups with fewer than five arms-length transfers from 2017-19. In addition, we excluded seven block groups that had sufficient sales activity but were extreme outliers on one or more of the neighborhood or property variables used in the analysis. For example, the density of residential housing units in the two block groups covering Lakewood’s Gold Coast area is more than double that of the next densest areas of the county.
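The pooling and minimum-sales rule described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the record layout, the block group IDs, and the helper name count_arms_length are all hypothetical, and only the five-transfer threshold comes from the text.

```python
from collections import Counter

# Hypothetical transfer records: (block_group_id, sale_year, is_arms_length).
transfers = [
    ("BG-A", 2017, True), ("BG-A", 2018, True), ("BG-A", 2018, True),
    ("BG-A", 2019, True), ("BG-A", 2019, True), ("BG-A", 2019, False),
    ("BG-B", 2017, True), ("BG-B", 2019, True),
    ("BG-C", 2018, False),
]
# BG-D is a hypothetical block group with no transfers at all.
all_block_groups = ["BG-A", "BG-B", "BG-C", "BG-D"]

MIN_TRANSFERS = 5  # threshold stated in the text


def count_arms_length(records, years):
    """Count arms-length transfers per block group within the given years."""
    return Counter(bg for bg, year, arms_length in records
                   if arms_length and year in years)


# Pooling three years gives each block group more observations than one year.
pooled = count_arms_length(transfers, {2017, 2018, 2019})
single_year = count_arms_length(transfers, {2019})

# Keep block groups that meet the threshold on the pooled count.
kept = [bg for bg in all_block_groups if pooled[bg] >= MIN_TRANSFERS]
dropped = [bg for bg in all_block_groups if pooled[bg] < MIN_TRANSFERS]

print("2019 only:", dict(single_year))
print("2017-19:  ", dict(pooled))
print("kept:", kept, "dropped:", dropped)
```

In this toy data, BG-A would fall below the threshold in 2019 alone but clears it once the three years are pooled. The outlier exclusion is not sketched because the text does not state the criterion used.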



Identifying underlying neighborhood dimensions (PCA)

We then selected the variables shown in Table 1, which together offer a well-rounded view of a neighborhood and its market conditions.

Table 1: Block Group Variables Used in Analysis
Variable             Definition
mn_alpricesft_adj    Mean arms-length sales price per square foot (inflation adjusted)
pct_w_sales          Percent of residential properties that sold at least once from 2017-19
var_alpricesft       Variation in sales price per square foot (standard deviation in price / mean price)
pct_foreclosed       Percent of residential properties that went through the foreclosure process
pct_pv               Percent of residential properties with a postal vacancy for at least one quarter
res_density          Residential density: residential properties per square mile
pct_own_occ          Percent of single-family, multi-family, and condo parcels claiming the owner-occupant tax credit
med_age              Median age of residential properties (as of 2019)
med_sqft_per_unit    Median square feet per housing unit (i.e., for a double, total square footage / 2)
mn_bath_perunit      Average number of baths per housing unit
mn_beds_perunit      Average number of beds per housing unit
pct_sf               Percent of residential properties that are single-family homes
pct_mf               Percent of residential properties that are multi-family homes (2-3 housing units)
pct_avg_plus         Percent of residential properties assessed as in ‘Fair’ condition or better by the county auditor
ct_demos             Count of residential demolitions from 2017-19
comm_density         Commercial density: commercial properties per square mile
hcv_density          Count of housing units that accepted a housing choice voucher per square mile
avg_ch_alprice       Average change in mean arms-length sale price ($/sqft) from 2010-12 to 2017-19
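Two of the definitions in Table 1 involve a small calculation, sketched below for a single hypothetical block group. The input numbers are made up for illustration. The choice of the population standard deviation is an assumption, since the table does not say whether a sample or population formula was used.

```python
from statistics import mean, median, pstdev

# Hypothetical arms-length sales in one block group, in dollars per square foot.
price_per_sqft = [50.0, 60.0, 70.0, 80.0, 90.0]

# var_alpricesft as defined in Table 1: standard deviation / mean.
# Assumption: population standard deviation (pstdev); the source does not specify.
mn_alpricesft = mean(price_per_sqft)
var_alpricesft = pstdev(price_per_sqft) / mn_alpricesft

# Hypothetical parcels: (total_square_feet, housing_units).
parcels = [(1200, 1), (2400, 2), (1500, 1), (3000, 3)]

# med_sqft_per_unit: divide each parcel's square footage by its unit count
# (so a double is divided by 2), then take the median across parcels.
med_sqft_per_unit = median(sqft / units for sqft, units in parcels)

print(round(var_alpricesft, 3), med_sqft_per_unit)
```

Dividing the standard deviation by the mean makes price variation comparable between block groups with very different price levels.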


The variables selected are related to one another in different, and often multiple, ways. Some of the relationships are obvious: for example, homes with lots of bedrooms tend to have lots of bathrooms. In other cases, the connections are less straightforward, or might be driven by the interrelationships of several variables together.

For example, pockets of infill development in areas like Ohio City/Tremont might be described by relatively new homes (low med_age), high commercial and residential density (comm_density; res_density), and low rates of homeownership (pct_own_occ). At the same time, newer homes might also be an important feature of the county’s outlying suburbs (think Strongsville, North Royalton); in this case, however, a low med_age would be associated with low levels of commercial and residential density and high homeownership rates. Principal component analysis (PCA) is a method for teasing out these sorts of underlying relationships and distilling them into a much smaller number of combined variables (often referred to as components or factors).

We ran a PCA on the variables in Table 1 and identified four combined measures that describe different underlying neighborhood dimensions. Together, these four dimensions account for approximately 80% of the variance in the original 17 variables.
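To make the idea of variance accounted for by a few components concrete, here is a small standard-library sketch of PCA on standardized variables. It is an illustration under stated assumptions, not the procedure used in the analysis: the six rows of data are invented, only four of the Table 1 variables are included, and the eigenvectors are found with simple power iteration and deflation rather than a statistical package.

```python
import math
import random


def standardize(rows):
    """Convert each column to z-scores (mean 0, population SD 1)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already-standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(p)]
            for i in range(p)]


def mat_vec(a, v):
    """Multiply a square matrix by a vector."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def top_components(matrix, k, iters=500, seed=0):
    """Leading k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    p = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [rng.random() for _ in range(p)]
        for _ in range(iters):
            w = mat_vec(a, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, mat_vec(a, v)))
        pairs.append((eigval, v))
        # Remove the component just found so the next pass finds the next one.
        a = [[a[i][j] - eigval * v[i] * v[j] for j in range(p)]
             for i in range(p)]
    return pairs


variables = ["mn_beds_perunit", "mn_bath_perunit", "res_density", "comm_density"]
# Hypothetical block groups (one row each), invented for illustration only.
rows = [
    [2.0, 1.0, 9000.0, 300.0],
    [2.5, 1.0, 8000.0, 250.0],
    [3.0, 1.5, 5000.0, 120.0],
    [3.5, 2.0, 3000.0, 60.0],
    [4.0, 2.5, 1500.0, 20.0],
    [3.0, 2.0, 4000.0, 100.0],
]

corr = correlation_matrix(standardize(rows))
components = top_components(corr, k=2)

# Each eigenvalue divided by the number of variables is that component's
# share of the total variance of the standardized variables.
explained = [eigval / len(variables) for eigval, _ in components]
for share, (_, weights) in zip(explained, components):
    print(round(share, 3), [round(w, 2) for w in weights])
```

Summing the shares of the retained components gives the kind of figure quoted above for the four neighborhood dimensions. In this toy data, beds and baths receive weights of the same sign on the first component and the two density measures receive the opposite sign, which is the sort of pattern the component loadings are meant to reveal.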

This table shows the component loadings of the four neighborhood dimensions.

This table shows the correlations of the four neighborhood dimensions with the individual variables used to create them, color-coded according to the strength of the association.



Map Dimensions by block group

These maps show the distribution of the four neighborhood dimensions, rescaled as percentile ranks.
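Rescaling a dimension to percentile ranks can be sketched as below. The scores are hypothetical, and the tie-handling rule (counting half of the tied values) is one common convention chosen here as an assumption; the source does not say which convention was used.

```python
def percentile_ranks(scores):
    """Percentile rank (0-100): share of values below, plus half of the ties."""
    n = len(scores)
    ranks = []
    for s in scores:
        below = sum(1 for x in scores if x < s)
        equal = sum(1 for x in scores if x == s)  # includes the value itself
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


# Hypothetical scores on one dimension for five block groups.
distress_scores = [-1.2, 0.3, 0.3, 2.1, -0.4]
print(percentile_ranks(distress_scores))
# prints [10.0, 60.0, 60.0, 90.0, 30.0]
```

Percentile ranks put all four dimensions on the same 0 to 100 scale, so the maps can share one legend even though the underlying component scores have different spreads.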

Distress

Higher values represent a struggling neighborhood with a distressed housing stock and vulnerable population.

Housing size/quality

Higher values represent low density, primarily residential land use, and a relatively pricey housing stock composed of larger homes with big yards.

Suburban zoning/land use

High values indicate a high density of single-family homes, to the exclusion of commercial or any other land use.

Market strength/resiliency

This index describes the strength of the housing market, defined by high rates of sales and price appreciation over the last decade.





Cluster Analysis

The next step was to use these four neighborhood dimensions to categorize block groups into different and meaningful neighborhood ‘types.’

This analysis identified a solution with 6 neighborhood types. However, to capture more variation among neighborhoods, we instead chose the best 8-cluster solution.


Examine Clusters


The average attributes of the block groups in each neighborhood type are shown in Table 3. For the four neighborhood dimensions/indices, negative/positive values are shaded pink/blue, to help illustrate the key differences between each type of neighborhood.


Table 4 shows the same information as Table 3; however, cells are filled in with the rank of the neighborhood type on each measure, from 1 (the lowest average value among all neighborhood types) to 8 (the highest). To help illustrate the defining features of each neighborhood type, the two lowest- and two highest-ranked neighborhood types on each measure are shaded.
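The relationship between the two tables can be sketched in a few lines: first average each measure within a neighborhood type, then rank the types on each measure. The data below is hypothetical and uses three types and two measures instead of eight types and the full variable list, purely to keep the example short.

```python
from collections import defaultdict

# Hypothetical block groups: (neighborhood_type, distress, market_strength).
block_groups = [
    (1, 1.4, -0.8), (1, 1.0, -0.4),
    (2, -0.6, 0.9), (2, -0.2, 0.5),
    (3, 0.1, 1.6), (3, 0.3, 1.2),
]
measures = ["distress", "market_strength"]

# Table 3 analogue: mean of each measure within each neighborhood type.
grouped = defaultdict(list)
for ntype, *values in block_groups:
    grouped[ntype].append(values)
type_means = {t: [sum(col) / len(col) for col in zip(*rows)]
              for t, rows in grouped.items()}

# Table 4 analogue: rank the types on each measure, 1 = lowest mean.
# Ties are not handled specially in this sketch.
type_ranks = {t: [] for t in type_means}
for m in range(len(measures)):
    ordered = sorted(type_means, key=lambda t: type_means[t][m])
    for rank, t in enumerate(ordered, start=1):
        type_ranks[t].append(rank)

print(type_ranks)
# prints {1: [3, 1], 2: [1, 2], 3: [2, 3]}
```

Here type 1 ranks highest on distress and lowest on market strength, which is the kind of contrast the shading in the rank table is meant to highlight.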



Distribution of Neighborhood Types


All block groups
Strongest examples of each neighborhood type



The clustering algorithm returns several pieces of information that offer insight into how it arrived at a particular solution. One of these is an ‘uncertainty score’ for each observation (block group). This is a number ranging from 0 to 1, with scores closer to 0 reflecting a high degree of confidence that a given block group was placed into the correct bucket, and scores closer to 1 indicating that a block group shared features associated with 2 or more different clusters and didn’t fit neatly anywhere.

In this map, for each neighborhood type we selected the 40 strongest examples of that type, i.e., the 40 block groups with the lowest uncertainty scores.
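The selection of strongest examples can be sketched as follows. The source does not define how the uncertainty score is computed, so this sketch assumes a definition that is common in model-based clustering: one minus the block group's largest cluster membership probability. Under that assumption, the score approaches but never quite reaches 1. The membership probabilities and block group IDs are hypothetical, and the sketch keeps the top 2 per type rather than the top 40.

```python
# Hypothetical membership probabilities for three neighborhood types.
memberships = {
    "BG-01": [0.97, 0.02, 0.01],
    "BG-02": [0.55, 0.40, 0.05],
    "BG-03": [0.10, 0.85, 0.05],
    "BG-04": [0.34, 0.33, 0.33],
    "BG-05": [0.80, 0.15, 0.05],
    "BG-06": [0.05, 0.60, 0.35],
}


def classify(probs):
    """Return (assigned type index, uncertainty).

    Assumption: uncertainty = 1 - largest membership probability.
    """
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best, 1.0 - probs[best]


def strongest_examples(memberships, top_n):
    """For each assigned type, return the top_n block groups with lowest uncertainty."""
    by_type = {}
    for bg, probs in memberships.items():
        ntype, uncertainty = classify(probs)
        by_type.setdefault(ntype, []).append((uncertainty, bg))
    return {t: [bg for _, bg in sorted(members)[:top_n]]
            for t, members in by_type.items()}


result = strongest_examples(memberships, top_n=2)
print(result)
# prints {0: ['BG-01', 'BG-05'], 1: ['BG-03', 'BG-06']}
```

BG-04, whose probabilities are split almost evenly across the three types, has the highest uncertainty and is left out. No block group is assigned to the third type in this toy data, so that type does not appear in the result.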